A REMARK ON PRIMALITY TESTING AND DECIMAL EXPANSIONS



TERENCE TAO 



Abstract. We show that for any fixed base $a$, a positive proportion of primes
have the property that they become composite after altering any one of their
digits in the base $a$ expansion; the case $a = 2$ was already established by
Cohen-Selfridge and Sun, using some covering congruence ideas of Erdős. Our method
is slightly different, using a partially covering set of congruences followed by
an application of the Selberg sieve upper bound. As a consequence, it is not
always possible to test whether a number is prime from its base $a$ expansion
without reading all of its digits. We also present some slight generalisations of
these results.



1. Introduction 

In 1950, Erdős [6] used the method of covering congruences to show that there exists
an infinite arithmetic progression of odd integers $m$ with the property that $|m - 2^i|$
is composite for every $i$. Modifying this method, Cohen and Selfridge [3] exhibited
an arithmetic progression of odd integers $m$ such that $|m - 2^i|$ and $m + 2^i$ are
both composite for every $i$. In [21], Sun gave the explicit arithmetic progression
$\{m : m = M \bmod \prod_{p \in \mathcal{P}} p\}$ with this property, where

M := 47867742232066880047611079 

and $\mathcal{P}$ is the finite set of primes

$\mathcal{P} := \{2, 3, 5, 7, 11, 13, 17, 19, 31, 37, 41, 61, 73, 97, 109, 151, 241, 257, 331\}$,

and noted that integers in this progression are in fact not of the form $\pm p^a \pm q^b$ for
any primes $p, q$ and positive integers $a, b$. Since $M$ is coprime to $\prod_{p \in \mathcal{P}} p$ we can
apply the prime number theorem in arithmetic progressions (see e.g. [14, Corollary
11.17]) to obtain the following immediate corollary:

Corollary 1.1. [3], [21] For all sufficiently large integers $n$, there exist at least
$c 2^n / n$ primes $p$ between $2^{n-1}$ and $2^n$ such that the integers $p - 2^i$ and $p + 2^i$ are
composite for every $0 \le i \le n - 1$, where $c > 0$ is an absolute constant.



We remark that primes p of the above form are initially rather rare; the first few 
primes of this form are 

1973, 3181, 3967, 4889, 8363, 8923, 11437, 12517, 14489, .... 

On the other hand, from Corollary 1.1 and the prime number theorem we see that
a positive proportion of the primes in fact lie in this sequence.
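
The defining property in Corollary 1.1 is easy to test directly for small primes. The following is a minimal sketch, not taken from the paper: the helper names `is_prime`, `is_composite` and `has_corollary_property` are hypothetical, trial division stands in for a real primality test, and $n$ is taken to be the bit length of $p$, so that $2^{n-1} \le p < 2^n$.

```python
def is_prime(n: int) -> bool:
    """Trial division; adequate for the small values used in this sketch."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True


def is_composite(n: int) -> bool:
    """Composite means greater than 1 and not prime (so 0 and 1 do not count)."""
    return n > 1 and not is_prime(n)


def has_corollary_property(p: int) -> bool:
    """True if p is prime and p - 2^i, p + 2^i are composite for 0 <= i <= n - 1,
    where n is the bit length of p (assumption: this is the n of Corollary 1.1)."""
    if not is_prime(p):
        return False
    n = p.bit_length()
    return all(
        is_composite(p - 2**i) and is_composite(p + 2**i) for i in range(n)
    )
```

For instance, `has_corollary_property(1973)` checks the eleven exponents $0 \le i \le 10$, while a prime such as 7 fails already at $i = 1$ because $7 - 2 = 5$ is prime.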

As an immediate corollary of Corollary 1.1, we see that for sufficiently large $n$,
there exist $n$-bit integers $p$ which are prime, but such that any number formed
from $p$ by switching one of the bits is not prime; the first few primes of this form
are 127, 173, 191, 223, 233, ... (a slight variant of sequence A065092 in [19], which is
the subsequence in which $p + 2^{n+1}$ is also required to be composite). In other words,
if we let $P_n : \{0,1\}^n \to \{0,1\}$ be the boolean function which returns 1 if and only
if the $n$-bit integer corresponding to the input in $\{0,1\}^n$ is prime, then the sensitivity
$s(P_n)$ of $P_n$ is equal to $n$ for sufficiently large $n$. Recall that the sensitivity (or
critical complexity) $s(B)$ of a Boolean function $B : \{0,1\}^n \to \{0,1\}$ is the largest
integer $s$ for which there exists an input $x \in \{0,1\}^n$ such that $B(x) \ne B(x')$ for at
least $s$ inputs $x'$ which are formed from $x$ by switching exactly one bit. We remark
that the lower bound $s(P_n) \ge \frac{1}{4} n + O(1)$ was previously established in [20, p. 307].
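
For small $n$ the sensitivity $s(P_n)$ can be computed by brute force straight from the definition. The sketch below is illustrative only: the function names are hypothetical, trial division is used for primality, and `fully_sensitive_primes` flips only the bits within the bit length of $p$ (an assumption about how the list 127, 173, ... is formed).

```python
def is_prime(n: int) -> bool:
    """Trial division; adequate for the small values used in this sketch."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True


def sensitivity_of_primality(n: int) -> int:
    """Brute-force s(P_n): the maximum, over all n-bit inputs x, of the number
    of single-bit flips x' for which P_n(x') differs from P_n(x)."""
    best = 0
    for x in range(2**n):
        fx = is_prime(x)
        changed = sum(1 for i in range(n) if is_prime(x ^ (1 << i)) != fx)
        best = max(best, changed)
    return best


def fully_sensitive_primes(limit: int) -> list:
    """Primes p <= limit such that flipping any single one of the
    bit_length(p) bits of p yields a non-prime."""
    found = []
    for p in range(2, limit + 1):
        if is_prime(p) and not any(
            is_prime(p ^ (1 << i)) for i in range(p.bit_length())
        ):
            found.append(p)
    return found
```

The cost grows like $n 2^n$ primality tests, so this is only a sanity check for very small $n$; it says nothing about the asymptotic statement, which rests on Corollary 1.1.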

If p is as above, then clearly it is not possible for an algorithm to determine with 
absolute certainty whether p is prime or not without inspecting all of the digits in 
the binary expansion. In particular, any deterministic primality tester can require 
computational time at least logarithmic in the size of the number being tested, if 
that number is represented in binary. For comparison, it was shown in [5, Theorem 
6] that any recursive algorithm which can decide the primality of an n-bit integer 
using the operations $=$, $<$, $+$, $-$, $2\cdot$, $\frac{1}{2}\cdot$, and parity, has time complexity at least
$\frac{1}{4} n$. We remark that for bounded depth circuits, much stronger lower bounds (of
exponential type in $n$) on the spatial complexity are known; see [1], [22].

In this note we establish a similar result for general bases. More precisely, we 
establish 

Theorem 1.2. Let $K > 1$ be an integer. Then for all sufficiently large $N$, the
number of primes $p$ between $N$ and $(1 + \frac{1}{K})N$ such that $|kp \pm ja^i|$ is composite
for all integers $1 \le a, j, k \le K$ and $1 \le i \le K \log N$ is at least $c_K \frac{N}{\log N}$ for some
constant $c_K > 0$ depending only on $K$.

From this theorem we see that the above results for binary expansions are also
valid in other bases. For instance, applying this theorem with $K = 10$ we
conclude that a positive proportion of the primes have the property that if one
changes any one of the digits in the base 10 expansion, one necessarily obtains a
composite number, and so any deterministic primality tester receiving the digits of
this number as input must read all of these digits in order to determine its primality.
The first few such primes are 294001, 505447, 584141, ... (sequence A050249 from
[19]). The infinitude of this sequence was established previously by Erdős [16].
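
The digit-changing property can be checked mechanically in any base. The following sketch is an illustration under stated assumptions rather than the paper's method: the names `is_prime` and `is_digit_sensitive_prime` are hypothetical, the leading digit is allowed to be changed to 0, and a changed number only needs to be non-prime.

```python
def is_prime(n: int) -> bool:
    """Trial division; adequate for the small values used in this sketch."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True


def is_digit_sensitive_prime(p: int, base: int = 10) -> bool:
    """True if p is prime and replacing any single base-`base` digit of p
    by any other digit always gives a non-prime."""
    if not is_prime(p):
        return False
    place = 1          # base**position of the digit currently examined
    rest = p
    while rest > 0:
        digit = rest % base
        for new_digit in range(base):
            if new_digit != digit and is_prime(p + (new_digit - digit) * place):
                return False
        rest //= base
        place *= base
    return True
```

With `base=2` the same function tests the bit-switching property discussed earlier, for example on 127.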

Our argument does not use a fully covering set of congruences. Instead, one uses
congruences modulo primes arising from Mersenne-type numbers (in which bases
such as $a$ have an unexpectedly low order) to sieve out most of the quadruples
$(a, j, k, i)$ appearing in the above theorem, leaving behind a small number which
can be handled via standard upper bound sieves. It seems to be difficult to establish
this result without such a preliminary sieving step, since without such a sieving one
would expect each $|kp \pm ja^i|$ to be prime with probability comparable to $\frac{1}{\log N}$,
which makes it moderately unlikely (especially for large $K$) that the $|kp \pm ja^i|$ are
composite for all $1 \le a, j, k \le K$ and $1 \le i \le K \log N$ for any given prime $p$.

The author is supported by NSF grant CCF-0649473 and a grant from the MacArthur
Foundation. The author is indebted to Yiannis Moschovakis for suggesting this
question, and to Jens Kruse Andersen, Yong-Gao Chen, Bjorn Poonen, Florian
Luca, Paul Pollack, Igor Shparlinski, Zhi-Wei Sun, and several anonymous
commenters on my blog for helpful comments and references.



2. Proof of Theorem 1.2 



We now prove Theorem 1.2. Fix $K$. We will need a large integer $M = M(K) > K$
to be chosen later. We will then use this integer $M$ to generate a finite set $\mathcal{P}$ of
primes, as follows:

Lemma 2.1. For any $M, K \ge 1$, there exists a finite set $\mathcal{P}$ of primes which can be
partitioned into disjoint sets $\mathcal{P} = \bigcup_{2 \le a \le K} \mathcal{P}_a$ with the following properties:



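• If $p \in \mathcal{P}_a$ for some $2 \le a \le K$, then there exists a prime $q_p$ such that

$$q_p > Mp \qquad (1)$$

and

$$a^p = 1 \bmod q_p. \qquad (2)$$

Furthermore, the primes $q_p$ for $p \in \mathcal{P}$ are all distinct.

• For each $2 \le a \le K$, we have

$$\sum_{p \in \mathcal{P}_a} \frac{1}{p} \ge M. \qquad (3)$$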



Proof. The claim is trivial for $K = 1$, so assume inductively that $K \ge 2$ and that
the claim has already been proven for $K - 1$. Thus we already have disjoint finite
sets of primes $\mathcal{P}_2, \dots, \mathcal{P}_{K-1}$ with the stated properties.

Let $W$ denote the product of all the numbers less than $K$ which are coprime to
$K$, and let $A$ denote the multiplicative order of $K \bmod W^K$. Observe that if $p$ is
a prime with $p = 1 \bmod A$, then $K^p - 1 = K - 1 \bmod W^K$. In particular, if $q$ is
any prime less than $K$, then $q$ can divide $K^p - 1$ at most $K$ times (since $K - 1$ is
not a multiple of $q^K$, being a smaller integer). As a consequence, we see that if $p$
is larger than some sufficiently large constant $C_K$, then the largest prime factor of
$K^p - 1$ is greater than $K$.

By the prime number theorem in arithmetic progressions (see e.g. [14, Corollary
11.17]), the sum of reciprocals of the primes $p = 1 \bmod A$ is divergent. From
this and Corollary A.3 we may find an infinite collection $\mathcal{P}'$ of primes
$p = 1 \bmod A$ which are larger than $C_K$, disjoint from the finite sets $\mathcal{P}_2, \dots, \mathcal{P}_{K-1}$,
such that $\sum_{p \in \mathcal{P}'} \frac{1}{p} = \infty$, and such that $mp + 1$ is composite for every $1 \le m \le M$.
For any $p$ in $\mathcal{P}'$, we set $q_p$ to be the largest prime factor of $K^p - 1$. Since $p > C_K$,
we have $q_p > K$. In particular, the multiplicative order of $K \bmod q_p$ is exactly $p$
(it divides the prime $p$, and it cannot equal 1 because $q_p > K$ does not divide $K - 1$),
which forces all the $q_p$ to be distinct. In particular, we can find a finite subset $\mathcal{P}_K$
of $\mathcal{P}'$ with $\sum_{p \in \mathcal{P}_K} \frac{1}{p} \ge M$ such that the values of $q_p$ for $p \in \mathcal{P}_K$ are distinct from
all the values of $q_p$ already assigned to $p$ in $\mathcal{P}_2, \dots, \mathcal{P}_{K-1}$.

From Fermat's little theorem we see that $p$ divides $q_p - 1$ for all $p \in \mathcal{P}_K$. On the
other hand, we have $mp + 1$ composite for every $1 \le m \le M$. Thus $q_p > Mp$ as
required. Thus $\mathcal{P} := \mathcal{P}_2 \cup \dots \cup \mathcal{P}_K$ obeys all the desired properties. ■
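
The two facts used at the end of the proof, namely that the base has multiplicative order exactly $p$ modulo $q_p$ and that $p$ divides $q_p - 1$, can be observed numerically. The sketch below is illustrative only; the helper names are hypothetical and the sample pairs (base 2 with $p = 11$, base 10 with $p = 7$) are chosen for this example, not taken from the paper.

```python
def largest_prime_factor(n: int) -> int:
    """Largest prime factor of n >= 2, by trial division."""
    largest = 1
    d = 2
    while d * d <= n:
        while n % d == 0:
            largest = d
            n //= d
        d += 1
    if n > 1:          # the remaining cofactor is a prime larger than any found so far
        largest = n
    return largest


def multiplicative_order(a: int, q: int) -> int:
    """Smallest k >= 1 with a^k = 1 mod q; assumes gcd(a, q) = 1."""
    k, x = 1, a % q
    while x != 1:
        x = (x * a) % q
        k += 1
    return k


def q_for(base: int, p: int) -> int:
    """The prime q_p attached to p in the proof: largest prime factor of base^p - 1."""
    return largest_prime_factor(base**p - 1)
```

For base 2 and $p = 11$ one has $2^{11} - 1 = 23 \cdot 89$, so $q_p = 89 = 8 \cdot 11 + 1$. The further requirement $q_p > Mp$ in the lemma comes from choosing $p$ so that $mp + 1$ is composite for $1 \le m \le M$, which this sketch does not attempt to enforce.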

Remark 2.2. In [7] it is shown that the largest prime factor of $2^p - 1$ is at least
$cp \log p$ for some absolute constant $c > 0$ (see also [15] for additional refinements
and further discussion). Slightly weaker results for more general bases can be found
in [12]. By using these results one can avoid the use of Corollary A.3.

Henceforth we let $\mathcal{P} = \mathcal{P}_2 \cup \dots \cup \mathcal{P}_K$, as well as the primes $q_p$ for $p \in \mathcal{P}$, be as in
the above lemma.

We let $N$ be a sufficiently large integer parameter. We use the asymptotic notation
$o(1)$ to denote any quantity that goes to zero as $N \to \infty$ (with $K$, $M$, and $\mathcal{P}$
fixed), and similarly $X \ll Y$ or $X = O(Y)$ to denote the estimate $X \le CY$ for
some $C$ depending on $K$ but independent of $N, M, \mathcal{P}$. We also write $X \sim Y$ for
$X \ll Y \ll X$.

By reducing the sets $\mathcal{P}_a$ if necessary, we may assume from (3) that

$$\sum_{p \in \mathcal{P}_a} \frac{1}{p} \sim M. \qquad (4)$$

Let $S$ denote the finite set of pairs

$$S := \{(j, k) \in \mathbb{Z}^2 : -K \le j \le K;\ 1 \le k \le K;\ j \ne 0\}.$$

By (3) and a simple greedy argument, we may partition $\mathcal{P}_a = \bigcup_{(j,k) \in S} \mathcal{P}_{a,j,k}$ in
such a way that

$$\sum_{p \in \mathcal{P}_{a,j,k}} \frac{1}{p} \gg M \qquad (5)$$

for all $2 \le a \le K$ and $(j, k) \in S$.
Let $W$ be the quantity

$$W := \prod_{p \in \mathcal{P}} q_p.$$

By the Chinese remainder theorem, we can find $b$ coprime to $W$ such that $kb + j = 0 \bmod q_p$
for $p \in \mathcal{P}_{a,j,k}$, $2 \le a \le K$, and $(j, k) \in S$. (Note from (1) and the hypothesis
$M > K$ that all integers between 1 and $K$ are coprime to $W$.)
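
The choice of $b$ is a routine Chinese remainder computation: for each modulus $q_p$ one solves $kb + j = 0 \bmod q_p$ and then glues the residues together. The sketch below shows this on a toy instance; the function names and the sample moduli 89, 683, 4649 with their assigned pairs $(j, k)$ are hypothetical choices made for illustration, and the three-argument `pow` with exponent `-1` requires Python 3.8 or later.

```python
from math import gcd


def crt_pair(r1: int, m1: int, r2: int, m2: int):
    """Combine x = r1 mod m1 and x = r2 mod m2 for coprime m1, m2."""
    t = ((r2 - r1) * pow(m1, -1, m2)) % m2
    return (r1 + m1 * t) % (m1 * m2), m1 * m2


def solve_for_b(assignment: dict):
    """assignment maps each prime modulus q to the pair (j, k) it must handle.
    Returns (b, W) with W the product of the moduli and k*b + j = 0 mod q for each q.
    Assumes the moduli are distinct primes larger than every |j| and k."""
    b, W = 0, 1
    for q, (j, k) in assignment.items():
        residue = (-j * pow(k, -1, q)) % q   # b = -j / k mod q
        b, W = crt_pair(b, W, residue, q)
    return b, W


# Toy data (not from the paper): three prime moduli and the pairs (j, k) assigned to them.
toy_assignment = {89: (1, 1), 683: (3, 1), 4649: (-1, 2)}
b, W = solve_for_b(toy_assignment)
```

Because each $|j|$ is nonzero and smaller than the corresponding modulus, every residue is nonzero, which is why the resulting $b$ is automatically coprime to $W$.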

To establish Theorem 1.2, it will suffice to show that the quantity

$$\#\{N \le m \le (1 + \tfrac{1}{K})N : m = b \bmod W;\ m \text{ prime, but } |km + ja^i| \text{ composite for all } 0 \le i \le K \log N,\ 1 \le a \le K,\ (j,k) \in S\} \qquad (6)$$

is $\gg N/\log N$. Note that when $a = 1$, the value of $i$ is irrelevant (and so can be
set for instance to zero). We can thus crudely bound (6) from below by

$$(6) \ge Q_N - \sum_{a=2}^{K} \sum_{(j,k) \in S} \sum_{0 \le i \le K \log N} Q_{N,i,a,j,k} - \sum_{(j,k) \in S} Q_{N,0,1,j,k} - O(\log N) \qquad (7)$$

where

$$Q_N := \#\{N \le m \le (1 + \tfrac{1}{K})N : m = b \bmod W;\ m \text{ prime}\}$$

and

$$Q_{N,i,a,j,k} := \#\{N \le m \le (1 + \tfrac{1}{K})N : m = b \bmod W;\ m, |km + ja^i| \text{ both prime}\}.$$

(The $O(\log N)$ error arises from the small number of cases in which $|km + ja^i|$ is
equal to zero or one.)

From the prime number theorem in arithmetic progressions (see e.g. [14, Corollary
11.17]) we have

$$Q_N \gg \frac{N}{\phi(W) \log N}$$

where

$$\phi(W) = W \prod_{p \in \mathcal{P}} \Big(1 - \frac{1}{q_p}\Big)$$

is the Euler totient function of $W$. (More precise asymptotics for $Q_N$ are available,
but we will not need them here.)

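From Corollary A.2 we have

$$Q_{N,i,a,j,k} \ll \frac{N}{W \log^2 N} \prod_{p \in \mathcal{P}} \Big(1 - \frac{1}{q_p}\Big)^{-2} \qquad (8)$$

for all $1 \le a \le K$, $0 \le i \le K \log N$, and $(j, k) \in S$. Applying this to dispose of the
$Q_{N,0,1,j,k}$ terms in (7), we thus conclude that

$$(6) \gg \frac{N}{W \log N} \prod_{p \in \mathcal{P}} \Big(1 - \frac{1}{q_p}\Big)^{-1} - O\Big(\sum_{a=2}^{K} \sum_{(j,k) \in S} \sum_{0 \le i \le K \log N} Q_{N,i,a,j,k}\Big) \qquad (9)$$

when $N$ is sufficiently large.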

Now suppose that $2 \le a \le K$ and $(j, k) \in S$. Observe that if $i = 0 \bmod p$ for any
$p \in \mathcal{P}_{a,j,k}$, then $|km + ja^i|$ is divisible by $q_p$ (since $a^i = 1 \bmod q_p$ by (2), and
$km + j = kb + j = 0 \bmod q_p$), and thus will be prime for at most one
value of $m$. Thus (paying a negligible factor of $O(\log N)$) we may restrict attention
to those $0 \le i \le K \log N$ such that $i \ne 0 \bmod p$ for every $p \in \mathcal{P}_{a,j,k}$. By the Chinese
remainder theorem, we see that the number of such $i$ is $O(\log N \prod_{p \in \mathcal{P}_{a,j,k}} (1 - \frac{1}{p}))$.
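Using the approximation

$$\prod_{n \in A} \Big(1 - \frac{1}{n}\Big) \sim \exp\Big(-\sum_{n \in A} \frac{1}{n}\Big),$$

which is valid for any finite set $A$ (since $\sum_{n \in A} \frac{1}{n^2} = O(1)$), we conclude from the
above discussion and (8) that

$$\sum_{0 \le i \le K \log N} Q_{N,i,a,j,k} \ll \frac{N}{W \log N} \exp\Big(-\sum_{p \in \mathcal{P}_{a,j,k}} \frac{1}{p} + O\Big(\sum_{p \in \mathcal{P}} \frac{1}{q_p}\Big)\Big).$$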



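But from (3), (1) we have $\sum_{p \in \mathcal{P}} \frac{1}{q_p} = O(1)$, while from (5) we have $\sum_{p \in \mathcal{P}_{a,j,k}} \frac{1}{p} \gg M$.
Inserting all these bounds into (9), we conclude

$$(6) \gg \frac{N}{W \log N} \big(1 - O(\exp(-cM))\big) \Big(\prod_{p \in \mathcal{P}} \Big(1 - \frac{1}{q_p}\Big)^{-1}\Big)$$

where $c > 0$ depends on $K$ but not on $M$. Taking $M$ sufficiently large depending
on $K$, we obtain the claim.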
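
The count of admissible exponents $i$ used in the proof, namely those with $i \ne 0 \bmod p$ for every $p$ in a given set of primes, is exactly a proportion $\prod_p (1 - \frac{1}{p})$ of any full period. A minimal numerical sketch (the function names and the toy set $\{3, 5, 7\}$ are hypothetical, chosen for illustration; `math.prod` requires Python 3.8 or later):

```python
from fractions import Fraction
from math import prod


def count_admissible(primes, length: int) -> int:
    """Number of 0 <= i < length that are not divisible by any of the given primes."""
    return sum(1 for i in range(length) if all(i % p != 0 for p in primes))


def predicted_fraction(primes) -> Fraction:
    """The product of (1 - 1/p) over the given primes, as an exact fraction."""
    return prod(Fraction(p - 1, p) for p in primes)
```

Over one full period of length $3 \cdot 5 \cdot 7 = 105$ the two quantities agree exactly; over an interval of length about $K \log N$ that is not a multiple of the period they agree only up to a bounded error, which is all that the $O(\cdot)$ bound in the proof requires.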



3. Remarks 



An inspection of the proof of Theorem 1.2 allows one to establish a strengthened
version in which the numbers $|kp \pm ja^i|$ are not only composite, but also
contain at least two distinct prime factors greater than $K$. More precisely, the
cases in which $|kp \pm ja^i|$ is the product of a prime power $q^b$ and some primes less
than or equal to $K$ can be disposed of by suitable variants of Corollary A.2 (and
in the case $b \ge 2$, the total contribution here is $O(\sqrt{N})$ which is easily discarded);
we omit the details. Recently in [18], it was shown that one can in fact ensure that
the numbers $kp \pm ja^i$ contain $\gg (\log \log N)^{1/3 - \epsilon}$ prime factors each for any fixed
$\epsilon > 0$.

In a somewhat different direction, it should also be possible to strengthen the
conclusion of Theorem 1.2 to assert that $|kp \pm ja^i + l|$ is composite for all $l$ in
some set $L = L_N \subset \{-KN, \dots, KN\}$ of cardinality at most $K$. A new difficulty
arises here due to an additional factor of $\prod_{p | \pm ja^i + l;\ p \nmid W} (1 - \frac{1}{p})^{-1}$ arising from the
use of Corollary A.2, but it seems likely that this quantity should be bounded for
the overwhelming majority of values of $a, l$, which should allow one to continue
the argument; we will not pursue this matter here. If one is able to carry out this
generalisation, one should be able to obtain the conclusion that for any base $a \ge 2$
and any $r \ge 1$, a positive proportion of the primes $p$ have the property that if
one modifies any single one of its digits in the base $a$ expansion, and appends or
deletes up to $r$ digits to the end and/or beginning of the digit string, one necessarily
obtains a composite number.

In a similar spirit, it was recently established in [11] that there exist infinitely many 
composite numbers coprime to 10 which remain composite after inserting a single digit
in their base 10 expansion. It seems likely that one should now also be able to find
infinitely many prime numbers with the same property (i.e. they become composite 
after inserting any digit at any place). 

In all of the above results, the total number of possible modifications of the digit
string remains comparable to $\log p$, and so the cases in which a number is
unexpectedly prime can be handled by the upper bound sieve after performing the
preliminary sieving to eliminate most of the cases. The problem becomes
significantly more difficult, however, if one asks that the number $p$ become composite
after allowing one to modify any two of the digits in the digit string, as the number
of possible modifications is now comparable to $\log^2 p$. Indeed, standard heuristics
from the prime tuples conjecture [8] now lead one to predict that for a sufficiently
large base, there should only be finitely many numbers of this form, although there
is a slim chance (especially in small bases) that Mersenne-type primes provide
enough congruences to fully cover all the modifications for primes in a certain
infinite arithmetic progression, as was the case with Corollary 1.1. We remark that
in [23] it was shown that there are infinitely many integers $n$ such that $n - 2^a - 2^b$
is not a prime power for any $a, b$ (an earlier result in [4] establishes the weaker
statement with "prime power" replaced by "prime"). The base 2 was generalised
to other bases recently in [2], and lower bounds on the density of such integers were
obtained in [2] and [17] (the latter result using the methods in this paper).

Using the circle method and bounds on prime exponential sums, there are several 
further results known relating primes to binary digits, or to powers of 2. For 
instance, in [9] the distribution of a bounded number of fixed digits of a large prime 
was studied. In [13] it was shown that the binary digit sum of a large prime was 
equally likely to be even as it was to be odd. In a slightly different direction, it was 
shown in [10] that all sufficiently large even numbers are the sum of two primes, 
together with at most 13 powers of two. 



Appendix A. Some sieve theory 



We recall the following standard application of the Selberg sieve to twin prime type 
problems: 

Theorem A.1 (Selberg sieve upper bound). Suppose that $y \ge 4$, and let $P :=
\prod_{p \le \sqrt{y}} p$. Let $B(p)$ be the union of $b(p)$ arithmetic progressions with common
difference $p$, and put $B := \bigcup_{p | P} B(p)$. If $b(2) \le 1$ and $b(p) \le 2$ for $p > 2$, then the
number of integers $0 \le r \le y$ such that $r \notin B$ is

$$\ll \frac{y}{\log^2 y} \prod_{p | P} \Big(1 - \frac{b(p)}{p}\Big)\Big(1 - \frac{1}{p}\Big)^{-2}.$$



Proof. See [14, Theorem 3.13]. As shown in that reference, one can in fact replace
the implied constant with $8 + O(\frac{\log \log y}{\log y})$, but we will not need this improvement. ■

Corollary A.2. Let $x, W, b \ge 1$ be integers with $W$ even, and let $h, k$ be non-zero
integers. Then if $x$ is sufficiently large depending on $W, b$, we have

$$\#\{0 \le m \le x : m = b \bmod W;\ m, |km + h| \text{ both prime}\} \ll \frac{x}{W \log^2 x} \prod_{p | W} \Big(1 - \frac{1}{p}\Big)^{-2} \prod_{p | h;\ p \nmid W} \Big(1 - \frac{1}{p}\Big)^{-1}$$

where the implied constant can depend on $k$.

Proof. By reversing the signs of $k$ and $h$ if necessary, and increasing the size of
the implied constant by a factor of 2 if necessary, we may replace $|km + h|$ by
$km + h$. We may assume that $b$ and $kb + h$ are both coprime to $W$, otherwise the
number of $m$ for which $m, km + h$ are both prime is bounded uniformly in $x$ and
the claim is trivial. For similar reasons we may assume that $k$ and $h$ are coprime.
Write $m = Wr + b$ and $y := x/W$, thus $0 \le r \le y$. We can restrict attention to
the case $r > \sqrt{y}$, since the case $r \le \sqrt{y}$ only contributes $O(\sqrt{y})$ elements, which
is acceptable. If $p \le \sqrt{y}$ is a prime, then the constraints that $m$ and $km + h$ both
be prime force $Wr + b$ and $kWr + kb + h$ to both be coprime to $p$. If $p | W$, then
this condition is vacuous; if $p | h$, $p \nmid W$, and $p \nmid k$, then this excludes one residue class
modulo $p$ from the space of possible $r$'s; and if $p \nmid h$, $p \nmid W$ and $p \nmid k$ then this excludes
two residue classes modulo $p$ from the space of possible $r$'s. Finally, if $p \nmid W$ and
$p | k$ then either one or two residue classes modulo $p$ are excluded. The claim now
follows from Theorem A.1 (note that $\log x$ is comparable to $\log y$ for $x$ large enough,
and that $\prod_{p > 2} (1 - \frac{2}{p})(1 - \frac{1}{p})^{-2}$ is comparable to 1). ■
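
The case analysis on excluded residue classes in the proof above can be confirmed by brute force for any particular prime. The sketch below is an illustration only: the function name `excluded_classes` is hypothetical, and the sample values of $W, b, k, h$ are toy choices satisfying the coprimality assumptions made in the proof.

```python
def excluded_classes(p: int, W: int, b: int, k: int, h: int) -> int:
    """Number of residues r mod p for which p divides W*r + b or k*W*r + k*b + h,
    i.e. the number of classes removed from the sieve at the prime p."""
    return sum(
        1
        for r in range(p)
        if (W * r + b) % p == 0 or (k * W * r + k * b + h) % p == 0
    )


# Toy instances at the prime p = 5 (values chosen for illustration):
one_class = excluded_classes(5, W=2, b=1, k=1, h=5)    # p | h, p does not divide W or k
two_classes = excluded_classes(5, W=2, b=1, k=1, h=2)  # p divides none of h, W, k
no_class = excluded_classes(5, W=10, b=1, k=1, h=2)    # p | W, so the condition is vacuous
k_case = excluded_classes(5, W=2, b=1, k=5, h=2)       # p | k, p does not divide W
```

These counts are the values $b(p)$ fed into Theorem A.1, which is where the two products over $p | W$ and over $p | h$, $p \nmid W$ originate.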

Corollary A.3 (Brun's theorem). Let $m, j$ be any positive integers. Then the sum
of reciprocals of the primes $p$ for which $mp + j$ is also prime is convergent.

Proof. By Corollary A.2, the number of primes of the above form which are less
than $x$ is $O(\frac{x}{\log^2 x})$ (where the implied constant can depend on $m$). The claim easily
follows by summation by parts. ■
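
Partial sums of the series in Corollary A.3 are easy to tabulate, which gives a feel for how slowly they grow. The following sketch is illustrative only: the function names are hypothetical, trial division is used for primality, and a finite computation of course cannot establish convergence.

```python
def is_prime(n: int) -> bool:
    """Trial division; adequate for the small values used in this sketch."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True


def primes_with_prime_image(m: int, j: int, limit: int) -> list:
    """Primes p <= limit for which m*p + j is also prime."""
    return [p for p in range(2, limit + 1) if is_prime(p) and is_prime(m * p + j)]


def reciprocal_sum(m: int, j: int, limit: int) -> float:
    """Partial sum of 1/p over the primes returned by primes_with_prime_image."""
    return sum(1.0 / p for p in primes_with_prime_image(m, j, limit))
```

With $m = 1$ and $j = 2$ this sums the reciprocals of the smaller members of twin prime pairs, the classical setting of Brun's theorem.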